DOCUMENT RESUME

ED 095 000                                              SE 017 719

AUTHOR          Herron, J. Dudley; And Others
TITLE           The Proper Experimental Unit: Comparative Analyses of
                Empirical Data.
PUB DATE        Apr 74
NOTE            16p.; Paper presented at the annual meeting of the
                National Association for Research in Science Teaching
                (47th, Chicago, Illinois, April 1974)

EDRS PRICE      MF-$0.75 HC-$1.50 PLUS POSTAGE
DESCRIPTORS     Chemistry; College Science; College Students;
                Educational Objectives; *Educational Research;
                *Research Design; Research Problems; Science
                Education
IDENTIFIERS     Research Reports



ABSTRACT 

Reported is a discussion concerning the effect of 
different analyses of the same empirical data. Students (N=over 200) 
enrolled in an introductory college chemistry course for science 
majors were randomly assigned to two experimental treatments. 
Treatment O was designed to help students understand the objectives
of the course and to emphasize the importance of the objectives.
Students in treatment R also received a list of objectives for the
course but emphasis was on providing feedback concerning their
progress toward meeting these objectives via a weekly 10 point quiz.
Half the students in each class in treatment R were told they had to
score at least 8 points on the quiz or they would receive a zero
grade for that quiz. They could, however, re-take the quiz as often
as they wished in order to achieve a score above 8 points. The other
students in the class could also re-take the quizzes but they were
under no coercion concerning their scores. For treatment O, the
appropriate experimental unit is the class section but the
appropriate experimental unit for treatment R is not clear. The
opportunity to re-take a quiz appeared to indicate that the
appropriate experimental unit, for R, was the individual. (PEE)

INTRODUCTION



If there is such a thing as a "typical" study in education it would be 
described as a study in which two or more existing classrooms are selected from 
some convenient population of possible classes (usually classrooms within a 
suitable driving distance of a particular university), demographic data are ob- 
tained which suggest that these classes are reasonably comparable, some kind of 
experimental treatment is administered to half of these classes (usually, but 
not always, selected more or less at random), and the performance of the
"experimental" and "control" groups is compared on some criterion measure.

In such a design, the treatment is assigned at random to classes (or perhaps
even to groups of classes taught by the same teacher) rather than to
individuals within classes. It has been pointed out that the appropriate
experimental unit in such studies is the class rather than the individual and
that the data should be analyzed using class means as the raw data rather than
individual scores on the criterion measure (Raths, 1967). Still, most data
analyses in science education studies are based on the use of the individual as
the experimental unit.^1 There are several possible explanations for this.
First, many researchers are simply ignorant of the theoretical arguments for
using the class as the experimental unit. Second, these experimenters want to
obtain significant results and feel that the reduction in degrees of freedom
which results from using the class rather than the individual as the
experimental unit will reduce the chance of getting these differences. Third,
some researchers, perhaps bolstered by the arguments of Fletcher (Fletcher,
1968), are not convinced that using group means as the experimental unit is a
better procedure than using the individual score.

Inquiries regarding this paper should be directed to Dr. J. D. Herron, Dept. of
Chemistry, Purdue University, W. Lafayette, Indiana 47907.

^1 In Vol. 10, Nos. 1-3 of JRST there are 10 studies for which there may be a
question concerning the appropriate experimental unit. Of these, 8 used the
individual while 2 used the class.

Whatever the reasons and regardless of the soundness of the arguments, the 
reader of research is faced with the fact that many of the studies which may 
interest him have been conducted under circumstances for which there is some 
question concerning the appropriateness of the experimental unit used in the 
analysis. What does one do? Should one ignore results and conclusions based
on analyses in which the wrong experimental unit was used?
Should the results be accepted without question? These are the issues with 
which we are concerned. 

We do not claim that we have the answers to the questions that we have 
raised. However, we do have data, based on one empirical study, which we think 
shed some light on these questions. We believe these data suggest that the 
problem may be less serious than some would suggest, that interpretations of 
data are likely to be similar (though certainly not identical) irrespective of 
the treatment, and that one may err more by dismissing a study out of hand
because an incorrect choice of experimental unit was made than by accepting
the results as "probably correct."

THE NATURE OF THE STUDY 
The concern of this paper is with the effect of different analyses of the
same empirical data. But it is necessary, first, to describe the nature of the
study on which these analyses were performed.

The study grew out of a concern over the effect of providing behavioral
objectives to college students. An earlier study had suggested that it made
little difference whether students in a beginning college chemistry course were
given lists of behavioral objectives (Herron, 1971). Those students who were
given the lists seemed to do little better than those who did not have them.
This result, being contrary to popular educational bias, started a search for
the reason.

One explanation entertained by the authors was that the students did not
really understand what the lists were saying.^2 It was decided that a treatment
which would "explain" the objectives to students should be included in the study
under discussion here.

^2 This hypothesis grew out of some unpublished work by Herron and Hiscox which
suggested that students had difficulty in matching objectives with test items
over the objectives.

Another possible explanation for the earlier observation that lists of
objectives had little effect on college chemistry students was that the students
simply did not make use of what they had - that they needed a little "coercion"
to get them to use the objectives. It was decided to provide some form of
"coercion" in this study.

The study was conducted in an introductory college chemistry course for
science majors. Over 200 students were enrolled in the course. All students
met for a large lecture session twice a week and met once for recitation and
once for a three-hour laboratory. Students had the same graduate assistant for
lab and recitation. There were a total of twelve of these small sections in the
course with enrollment ranging from a low of 14 to a high of 22, with a mean
enrollment of 18. The two experimental treatments were assigned at random to
class sections, so that there were three sections represented in each of the
four cells of the 2x2 factorial design.
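
The assignment of twelve sections to the four cells, three per cell, can be illustrated with a short sketch. This is only an illustration under stated assumptions: the section labels, the seed, and the function name `assign_sections` are hypothetical, and the paper does not say how the randomization was actually carried out. Upper and lower case letters mark the presence or absence of a treatment.

```python
import random


def assign_sections(sections, seed=None):
    """Randomly place class sections into the four cells of a 2x2 (R x O) design.

    Upper case = treatment present, lower case = treatment absent.
    The shuffle-and-slice procedure is an assumption, not the authors' method.
    """
    cells = [("R", "O"), ("R", "o"), ("r", "O"), ("r", "o")]
    if len(sections) % len(cells) != 0:
        raise ValueError("sections must divide evenly among the four cells")
    rng = random.Random(seed)
    shuffled = list(sections)
    rng.shuffle(shuffled)
    per_cell = len(shuffled) // len(cells)
    return {
        cell: sorted(shuffled[i * per_cell:(i + 1) * per_cell])
        for i, cell in enumerate(cells)
    }


# Hypothetical labels for the twelve small sections.
sections = [f"S{n:02d}" for n in range(1, 13)]
design = assign_sections(sections, seed=1974)
for cell, members in design.items():
    print(cell, members)
```

Because whole sections are shuffled, every student in a section shares that section's treatment combination, which is the reason the section is a candidate experimental unit.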

The two treatments in the design consisted of the following: 

1. Treatment O: This treatment was intended to help students understand the
objectives of the course and to emphasize the importance of the objectives.
Students in a class section receiving this treatment found that each of their
recitation classes was organized around the list of objectives for the week.
The class was conducted by going over the list of objectives, trying to
determine which objectives the students were having difficulty with and providing
help with these objectives. In most instances, the help would be in the form
of a referral to one or more of the homework problems which were related to
the objective.

If a class section did not receive Treatment O, the recitation session was
used for studying assigned homework. Objectives were not mentioned unless a
student asked a specific question about them. It should be emphasized that
all students had a list of objectives for the course. These were handed out
in the laboratory each week. Treatment O simply represents a difference in
the attention given to the lists of objectives during the recitation session.
2. Treatment R: A quiz was given to all students during the first half-hour of
each laboratory session. This quiz was related to the objectives for the
previous week and was scored on a 10 point basis. The primary purpose of
the quiz was to provide students with feedback concerning their progress
toward meeting the objectives of the course. In order to provide a "coercion"
treatment for some individuals in the course, students in half the class
sections were told that they must either score at least 8 points on the weekly
quiz or receive a grade of zero for that quiz. If they scored below 8 points
they had an opportunity to re-take the quiz as many times as they liked but
their score would remain zero until they scored 8 points or above; thus,
possible scores for students in this treatment were 10, 9, 8 and 0.
Students in other class sections were given the same weekly quiz and the
same opportunity to re-take the quiz as many times as they wished. However,
there was no attempt to coerce them to do so. These students received
whatever score they made on the quiz at the first administration or, if they
took the quiz over, the score that they received the second time, be it
higher or lower. Thus, a student under the "R" treatment who scored below
8 points - a 7, for example - had nothing to lose by re-taking the quiz and
8 points to gain. A student who was not in the treatment group and who
scored 7 points had little to gain (a maximum of 3 points) by re-taking the
quiz and he could possibly lose since his final score for the quiz would be
that which he obtained on the last administration of the quiz.
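
The two scoring rules can be stated compactly. The sketch below is a hedged illustration: the function name `recorded_score` is hypothetical, and it assumes that in both conditions the last attempt is the one that counts, with the additional floor-to-zero rule applied only in the coerced sections.

```python
def recorded_score(attempts, coerced):
    """Return the quiz score that would be recorded for one student.

    attempts: quiz scores (0-10) in the order the quiz was taken.
    coerced:  True for sections told "score at least 8 or receive zero".

    Assumption: the last attempt counts in both conditions; under coercion
    any last score below 8 is recorded as 0.
    """
    if not attempts:
        raise ValueError("at least one attempt is required")
    last = attempts[-1]
    if coerced:
        return last if last >= 8 else 0
    return last


print(recorded_score([7], coerced=True))      # 0: below 8 is recorded as zero
print(recorded_score([7, 9], coerced=True))   # 9: the re-take reached the cutoff
print(recorded_score([7], coerced=False))     # 7: first score stands
print(recorded_score([7, 5], coerced=False))  # 5: a re-take can lower the score
```

The asymmetry in incentives described above is visible in the first and last calls: the coerced student with a 7 can only gain by repeating, while the uncoerced student risks a lower final score.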

In summary, the basic 2x2 factorial design consisted of random assignment of
class sections to each of four cells represented in Table I.

                         Table I

                              Treatment O*

                          O               o

                 R    1c, 3b, 6b     1b, 4a, 6a
  Treatment R*
                 r    1a, 2a, 4b     2b, 3a, 6c

*Upper case letters represent cells containing class sections that
received the indicated treatment; lower case letters represent those
cells for which the indicated treatment was absent.




THE EXPERIMENTAL UNIT 

Now that the design of the study has been briefly outlined, we turn to the
question of the appropriate experimental unit. It seems clear that for
Treatment O, the appropriate experimental unit is the class section. The treatment
is administered to the section as a whole and it is very likely that interactions
among students within the class constitute an important part of the treatment.
Thus, the appropriate analysis of the data would be to treat the mean for each
class section as a "score" for that section and conduct the analysis of variance
accordingly.

The appropriate experimental unit for Treatment R is not as clear. Although 
Treatment R was assigned to class sections at random rather than to individuals, 
the treatment itself is essentially an individual treatment. Each individual 
student decided whether he wanted to repeat a quiz or receive a grade of zero. 
There is little reason to believe that the choices of others in his class sec- 
tion would have any important influence on his decision since everyone who re- 
peated the quiz did so individually and outside of class time. Although the 
reader may disagree, we are inclined to say that the appropriate experimental 
unit for Treatment R is the individual. 

Partially as a result of the "mixed" nature of our experiment, but primarily
out of curiosity, it was decided that the data from this study would be analyzed
in several ways. These various analyses are summarized in Table II. The
simplest of these analyses (represented by Ia in Table II) consists of a
conventional two-way ANOVA using the individual as the experimental unit. The second
analysis in the table (Ib) is identical to the first with the exception that the
section mean is used as the experimental unit. These two analyses, using various
measures as the criterion variable, provide an opportunity to compare the
conclusions that would result from the same data when either the individual or the
class is considered as the experimental unit.
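
The contrast between the two Type I analyses can be sketched as follows. This is a minimal illustration, not the authors' computation: the scores are invented, every section is given the same size so that a simple balanced two-way ANOVA applies, and the helper `two_way_anova` is a hypothetical name. The point of interest is how the residual degrees of freedom shrink when section means replace individual scores.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced 2x2 ANOVA. cells maps (r_level, o_level) -> equal-length score list."""
    r_levels = sorted({r for r, _ in cells})
    o_levels = sorted({o for _, o in cells})
    n = len(next(iter(cells.values())))
    if any(len(v) != n for v in cells.values()):
        raise ValueError("this sketch handles balanced designs only")
    grand = mean(x for v in cells.values() for x in v)
    cell_mean = {k: mean(v) for k, v in cells.items()}
    r_mean = {r: mean(cell_mean[(r, o)] for o in o_levels) for r in r_levels}
    o_mean = {o: mean(cell_mean[(r, o)] for r in r_levels) for o in o_levels}
    # Each effect has 1 degree of freedom in a 2x2 design, so SS equals MS.
    ss_r = n * len(o_levels) * sum((m - grand) ** 2 for m in r_mean.values())
    ss_o = n * len(r_levels) * sum((m - grand) ** 2 for m in o_mean.values())
    ss_ro = n * sum(
        (cell_mean[(r, o)] - r_mean[r] - o_mean[o] + grand) ** 2
        for r in r_levels for o in o_levels
    )
    ss_res = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)
    df_res = len(cells) * (n - 1)
    ms_res = ss_res / df_res
    return {"df_residual": df_res, "F_R": ss_r / ms_res,
            "F_O": ss_o / ms_res, "F_RxO": ss_ro / ms_res}


# Hypothetical data: three sections per cell, four students per section.
sections = {
    ("R", "O"): [[70, 74, 78, 82], [66, 72, 75, 79], [80, 83, 85, 88]],
    ("R", "o"): [[61, 65, 70, 72], [68, 71, 73, 76], [59, 64, 66, 71]],
    ("r", "O"): [[63, 69, 72, 76], [71, 75, 77, 81], [60, 62, 67, 71]],
    ("r", "o"): [[65, 68, 70, 73], [58, 63, 66, 69], [72, 74, 78, 80]],
}
individual = {k: [x for sec in v for x in sec] for k, v in sections.items()}
by_section = {k: [mean(sec) for sec in v] for k, v in sections.items()}

print(two_way_anova(individual)["df_residual"])  # 44 = 4 cells x (12 - 1)
print(two_way_anova(by_section)["df_residual"])  # 8  = 4 cells x (3 - 1)
```

With three sections per cell, the section-mean analysis always has 8 residual degrees of freedom, whatever the number of students, which is the loss of degrees of freedom referred to in the Introduction.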

Table II*
Types of Analyses Performed

Type of     Criterion    Experimental           3rd       Co-
Analysis    Measure      Unit                   Factor    Variates

Ia          Q08 or TP    Individual Score       ---       ---
Ib          Q08 or TP    Section Mean           ---       ---
IIa         Q08 or TP    Individual Score       ---       V-M
IIb         Q08 or TP    Section Mean           ---       V-M (mean)
IIIa        Q08 or TP    Individual Score       M         ---
IIIb        Q08 or TP    Split Section Mean     M (rm)    ---
IVa         Q08 or TP    Individual             M         Qrep
IVb         Q08 or TP    Split Section Mean     M (rm)    Qrep

*For an explanation of the symbols used in this table, refer to the legend
accompanying Table VIII, page 14.

Since there is always the question of the comparability of class groups,
other analyses were performed in which SAT-M and SAT-V scores were used either as
covariates or as a third factor in the factorial design. For example, analysis
IIa in Table II is identical to analysis Ia with the exception that the verbal
and math scores on the SAT exam have been used as covariates to statistically
adjust the scores on the criterion measure for differences in ability that might
have existed between treatment groups. Analysis IIb parallels Ib and is
equivalent to analysis IIa with the exception that the class is treated as the
experimental unit. The scores used for the covariate adjustment are the mean
SAT-V and the mean SAT-M for the class section. The next pair of analyses shown
in Table II (IIIa and IIIb) represent those for which the SAT scores are used as
a third factor in a 2 x 2 x 3 factorial design. In analysis IIIa, the sample of
students was stratified into a high, average, and low SAT group and the ANOVA
was done to determine if there was a main effect due to "ability" as measured by
the SAT-M test. It should be noted that analysis IIIb is not exactly parallel
to IIIa. Since the section means represent contributions from high, average, and
low ability students, an analysis in which there is a stratification of SAT
section means does not make any sense. The analysis which was done is one suggested
by Page (Raths, 1967), in which one stratifies each section into a high, average
and low ability (SAT in our case) group, calculates both the mean SAT and the
mean on the criterion measure, and then treats the data as though they are
repeated measures of the section mean on the criterion measure when the section
has a high SAT, when the same section has an average SAT, and when the same
section has a low SAT score. This appears to be the nearest equivalent to analysis
IIIa when one wishes to use the class as the experimental unit.

Still a fourth kind of analysis is represented by analyses IVa and IVb.
This pair is identical to IIIa and IIIb with the exception that now both a
covariate and a third factor are added in the analysis.
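
The "split section mean" used in analyses IIIb and IVb can be sketched as below. This is an illustration only: the function name `split_section`, the student records, and the rule for cutting a section into thirds (rank on SAT-M, then cut at n/3 and 2n/3) are assumptions, since the paper does not describe how uneven section sizes were handled.

```python
from statistics import mean


def split_section(students):
    """Split one class section into low, average, and high SAT thirds.

    students: list of (sat_math, criterion_score) pairs for the section.
    Returns a dict mapping each third to (mean SAT, mean criterion score),
    which are then treated as three repeated measures on the same section.
    """
    if len(students) < 3:
        raise ValueError("a section needs at least three students to be split")
    ranked = sorted(students, key=lambda s: s[0])
    n = len(ranked)
    cut1, cut2 = n // 3, (2 * n) // 3
    groups = {
        "low": ranked[:cut1],
        "average": ranked[cut1:cut2],
        "high": ranked[cut2:],
    }
    return {
        name: (mean(s[0] for s in members), mean(s[1] for s in members))
        for name, members in groups.items()
    }


# Hypothetical section of six students: (SAT-M, total points).
section = [(480, 310), (520, 340), (560, 355), (600, 372), (640, 390), (700, 420)]
for third, (sat, criterion) in split_section(section).items():
    print(third, sat, criterion)
```

Each section thus contributes three criterion means rather than one, and the section itself remains the unit on which the repeated measures are taken.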

We have summarized the kinds of analyses which were performed. There were,
in fact, several analyses of each type carried out. These differed in what was
used as the criterion measure. A total of over forty analyses of variance and
covariance were performed of which 24 are presented in this paper. In the
analyses which we are presenting, only two criterion measures are used. One is the
total number of points (see legend for Table VIII) that the student accumulated
in the course and the second is the number of quizzes on which the student scored
eight or above. The rationale for choosing these as criterion measures was quite
simple. It was assumed that the total points accumulated in the course was
likely to be the most sensitive measure of student achievement in the course and we
were primarily interested in knowing how our treatments would affect achievement.
We first selected "number of quizzes over eight" as a criterion measure to see
if we did in fact have an "R" treatment. If our admonition to repeat quizzes on
which a score of less than eight points was obtained was taken seriously, then
we should certainly see a "main effect" for Treatment R when "number of quizzes
over eight" is taken as the criterion measure. Later, this criterion measure
proved useful in comparing the interpretations that would be drawn from the
results of the various analyses described in Table II.

Table III summarizes the means and standard deviations for each class
section on each of the criterion variables and the two covariates. Adjusted means
for the various covariate analyses are not given since there were a number of
such analyses, each resulting in a different adjusted mean. There is no simple
way to present all of these data and, in the interest of brevity, all have been
omitted. Table III also shows the number of individuals in each class section.
Numbers in parentheses represent the number of individuals for which complete
data were available.

With over 24 different analyses of variance performed as a part of this study,
it would consume a considerable amount of space to present all of the ANOVA tables
in this paper. However, representative tables are presented. Tables IV-VII
show the ANOVA tables for analyses 1 (type Ia), 10 (type IIb), 14 (type IIIb), and
23 (type IVa) respectively.

Table VIII summarizes the various analyses that were performed and the
results that were obtained. By comparing the results of various analyses, some
light is shed on the question of the importance of the correct choice of
experimental unit in these analyses. During the discussion of this paper, attention
will be focused on the following comparisons:

1 vs 2      These analyses are of Type I (see Table II). The first member of
3 vs 4      each pair treats the individual as the experimental unit while
            the second member of the pair treats the class as the experimental
            unit. In the analyses in which the number of quizzes with scores
            of 8 or over (Q08) is used as the criterion, it is seen that there
            is no difference in the interpretation that would be made,
            regardless of the experimental unit. However, in analyses 3 and 4
            where total points (TP) accumulated in the course is used as the
            criterion, it appears that use of the individual as the
            experimental unit may result in a conclusion that there is a
            significant interaction between treatments R and O, whereas no
            significant differences are found when the division is used as the
            experimental unit.

5 vs 6      These analyses differ from the previous ones in that a covariate
7 vs 8      has been added (Type II of Table II). In three of the four pairs,
9 vs 10     the conclusions that would be drawn when the individual is used
11 vs 12    as the experimental unit are identical to conclusions that would
            be drawn when the class is treated as the experimental unit. The
            exception is in analysis 6 where a significant main effect due to
            treatment R is indicated. It should be noted that the discrepancy
            between analyses 5 and 6 favors a significant difference in the
            case where the class is the experimental unit; the discrepancy
            between analyses 3 and 4 favors a significant difference in the
            case where the individual is the experimental unit.

13 vs 14    These analyses have a third factor in the design but no covariate
15 vs 16    (Type III in Table II). Once more, in three of the four
17 vs 18    comparisons, the results of the analyses are essentially the same.
19 vs 20    However, in analysis 13, a significant interaction is detected
            which does not show in analysis 14. The analysis which appears to
            be more powerful is the one which utilizes the individual as the
            experimental unit. Some readers may wish to question whether these
            comparisons are really parallel since one member of each pair
            involves stratification on the SAT score while the other member of
            the pair is a rather strange repeated measures analysis.

21 vs 22    These analyses have both a third factor and a covariate in the
23 vs 24    design (Type IV in Table II). The discrepancies in these analyses
            are the most distressing. In analysis 22, which uses the class as
            the experimental unit, a main effect which is significant at the
            .0003 level appears but there is no comparable effect seen in the
            analysis which utilizes the individual as the experimental unit.
            In spite of the very large F, the authors are inclined to believe
            that this is a spurious result. Although not likely due to chance,
            the difference is not likely due to treatment either. The
            explanation of the anomaly will be discussed.
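
The paired comparisons amount to asking whether the set of significant effects is the same under the two choices of experimental unit. A small sketch of that bookkeeping is given here. The effect sets are transcribed from Table VIII for the pairs singled out in the discussion above; the dictionary layout and the function name `compare_pair` are hypothetical conveniences, not part of the original analysis.

```python
# Effects significant at the .05 level, by analysis number (from Table VIII).
# Odd numbers use the individual (SUB); even numbers use the class (DIV).
SIGNIFICANT = {
    1: {"R", "RxO"}, 2: {"R", "RxO"},
    3: {"RxO"}, 4: set(),
    5: {"RxO"}, 6: {"R", "RxO"},
    13: {"M", "RxO"}, 14: {"M"},
    21: {"RxO"}, 22: {"R", "RxO"},
}


def compare_pair(individual_id, class_id, table=SIGNIFICANT):
    """Report where the individual-unit and class-unit analyses disagree."""
    ind, cls = table[individual_id], table[class_id]
    return {
        "agree": ind == cls,
        "only_individual": sorted(ind - cls),
        "only_class": sorted(cls - ind),
    }


for a, b in [(1, 2), (3, 4), (5, 6), (13, 14), (21, 22)]:
    print(f"{a} vs {b}:", compare_pair(a, b))
```

Laid out this way, the discrepancies run in both directions: in pairs 3 vs 4 and 13 vs 14 the extra effect appears only with the individual as the unit, while in pairs 5 vs 6 and 21 vs 22 it appears only with the class as the unit.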



TABLE III
[Means and standard deviations for each class section on the criterion
variables and the two covariates (mean SAT-V and SAT-M), with the number of
individuals in each section. The table entries are illegible in this copy.]

TABLE IV
ANOVA Table for Analysis 1

Source of Variance    D.F.        S.S.        M.S.          F

Treatment R              1     147.221     147.221     14.376**
Treatment O              1        .147        .147       .014
R X O                    1      43.465      43.465      4.244*
Residual               166    1699.936      10.241

Total                  169    1890.768

** p < .01
 * p < .05
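
As a check on how the entries in Table IV fit together, each F is the mean square for the effect divided by the residual mean square. The sketch below recomputes the F column from the degrees of freedom and sums of squares printed in the table; the variable and function names are hypothetical.

```python
# Degrees of freedom and sums of squares as printed in Table IV.
table_iv = {
    "Treatment R": (1, 147.221),
    "Treatment O": (1, 0.147),
    "R x O": (1, 43.465),
}
residual_df, residual_ss = 166, 1699.936
ms_residual = residual_ss / residual_df  # about 10.241


def f_ratio(df, ss, ms_error=ms_residual):
    """F = (SS / df) / MS_error."""
    return (ss / df) / ms_error


for source, (df, ss) in table_iv.items():
    print(f"{source:12s} F = {f_ratio(df, ss):.3f}")
# Treatment R  F = 14.376
# Treatment O  F = 0.014
# R x O        F = 4.244
```

The recomputed values agree with the printed F column, and the same relation holds for the other ANOVA tables when the appropriate error term is used.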



TABLE V
ANOCOVA Table for Analysis 10

Source of Variance         S.S.        M.S.        F

Treatment R               9.847       9.847     .033
Treatment O            1017.289    1017.289    3.441
R X O                   409.339     409.339    1.385
Residual               1773.640     295.607

Total                  3210.115


TABLE VI
ANOVA (REPEATED MEASURES)
TABLE for Analysis 14

Source of Variance   D.F.           S.S.           M.S.            F

Mean                    1   8227278.7445   8227278.7445   10043.4870
Treatment R             1      1187.0322      1187.0322         .619
Treatment O             1      2445.9619      2445.9619        1.275
SAT-Math                2     28076.0950     14038.0475      17.1370**
R X O                   1       502.6564       502.6564         .262
R X M                   2      1768.5114       884.2557       1.0795
O X M                   2      3891.2655      1945.6328       2.3751
U(RO)*                  8     15350.2003      1918.7750       2.3424
ROM                     2      1387.5300       693.7650        .8469
MU(RO)*                16     13106.6491       819.1656

*Note that in the repeated measures with a mixed model, the estimate of the error
variance is U(RO) for variables R and O but the best estimate is MU(RO) for variable M.

**p < .01

TABLE VII
ANOCOVA Table for Analysis 23

Source of Variance    D.F.        S.S.       M.S.         F

Treatment R              1       4.064      4.064      .547
Treatment O              1       4.410      4.410      .593
SAT-Math                 2      19.480      9.740     1.310
R X O                    1      66.760     66.760     8.979**
R X M                    2      10.561      5.280      .710
O X M                    2      13.161      6.580      .885
R X O X M                2      16.512      8.256     1.110
Residual               157    1167.361      7.435

Total                  168    1302.308

**p < .01


TABLE VIII
Summary of Analyses and Results

                       CODE                                     RESULTS

Analysis   Crite-        Experi-   3rd      Co-
Number     rion          mental    Factor   Variates
                         Unit

 1.        Q08           SUB       ---      ---        R: F = 14.4; p = .0002
                                                       RxO: F = 4.2; p = .04
 2.        Q08           DIV       ---      ---        R: F = 61.2; p = .0001
                                                       RxO: F = 18.9; p = .002
 3.        TP            SUB       ---      ---        RxO: F = 3.9; p = .05
 4.        TP            DIV       ---      ---        No Significant Differences
 5.        [illegible]   SUB       ---      Qrep       RxO: F = [illegible]; p = .004
 6.        [illegible]   DIV       ---      Qrep       R: F = [illegible]; p = .03
                                                       RxO: F = 17.8; p = .004
 7.        TP            SUB       ---      V          RxO: F = 4.4; p = .04
 8.        TP            DIV       ---      V          RxO: F = 5.6; p = .05
 9.        TP            SUB       ---      VM         No Significant Differences
10.        TP            DIV       ---      VM         No Significant Differences
11.        TP            SUB       ---      M          No Significant Differences
12.        TP            DIV       ---      M          No Significant Differences
13.        TP            SUB       M        ---        M: F = 7.2; p = .007
                                                       RxO: F = 3.7; p = .05
14.        TP            DIV       M(rm)    ---        M: F = 17.1; p = .0001
15.        TP            SUB       V        ---        V: F = 13.4; p = .0003
16.        TP            DIV       V(rm)    ---        V: F = 4.7; p = .02
17.        Q08           SUB       M        ---        R: F = 12.7; p = .001
                                                       RxO: F = 4.3; p = .04
18.        Q08           DIV       M(rm)    ---        RxO: F = 12.3; p = .008
19.        Q08           SUB       V        ---        R: F = 16.6; p = .001
                                                       RxO: F = 4.0; p = .05
20.        Q08           DIV       V(rm)    ---        R: F = 78.2; p = .0000[?]
                                                       RxO: F = 10.9; p = .01
21.        Q08           SUB       V        Qrep       RxO: F = 8.5; p = .005
22.        Q08           DIV       V(rm)    Qrep       R: F = 35.3; p = .0003
                                                       RxO: F = 12.9; p = .007
23.        Q08           SUB       M        Qrep       RxO: F = 9.0; p = .005
24.        Q08           DIV       M(rm)    Qrep       RxO: F = 10.3; p = .02


Legend 



Q08   indicates that the criterion measure for this analysis was the number
      of quiz scores equal to or greater than eight.

TP    indicates that the criterion measure for this analysis was the total
      number of points earned in the course.

SUB   indicates that the individual subject was treated as the experimental
      unit in this analysis.

DIV   indicates that the mean of the scores for individuals within a division
      (class section) was treated as the experimental unit in this analysis.

V     represents the verbal score on the Scholastic Aptitude Test. When V is
      shown in the column headed "3rd Factor", it indicates that the sample
      was stratified into high, average, and low thirds on the basis of
      SAT-verbal score. When V appears in the column headed "Co-variates," it
      indicates that SAT-verbal scores were used as a covariate in the
      analysis.

M     represents the mathematics score on the Scholastic Aptitude Test.

(rm)  Where this symbol follows V or M, it indicates that the analysis
      involved a repeated measures analysis rather than stratification on the
      variable.

Qrep  refers to the number of quizzes which were repeated. (Students were
      allowed to repeat a quiz as many times as desired but the score
      recorded was the last score obtained.)

R     represents the main effect of treatment R. (See page 4 for a
      description of the treatment.)

RxO   represents an interaction between treatment R and treatment O. (See
      page 4 for a description of the treatments.)

Under the RESULTS column, each row represents a result which was statistically
significant at the 0.05 level or beyond. In each row, the first letter
represents the factor in the analysis which produced the significant F. This is
followed by the value of F and the probability that an F of that value would
occur by chance alone.

Sample Interpretation:

Refer to the row representing analysis number 24. In this analysis the
number of quizzes with scores of eight or more was the criterion measure. The
division (class section) was treated as the experimental unit, i.e. the "scores"
treated in the statistical analysis were division means rather than individual
scores. The analysis of variance utilized a 2x2x3 factorial design with
treatments R and O as the first two factors. The "third factor" was a repeated
measure using mean SAT-math scores for the high third, middle third, and low
third of a division as the repeated measure. The number of quizzes repeated by
the students was used as a covariate. This analysis produced no significant
main effects. There was a significant RxO interaction which produced an F of
10.3. This value of F would occur by chance about 2 times in 100.



